| Severity | Dataset | Precision | NNS | Balanced Precision | Recall | Specificity | TP | FP | TN | FN |
|---|---|---|---|---|---|---|---|---|---|---|
| All-cause | Full | 0.33 | 3.0 | 0.87 | 0.12 | 0.98 | 2 | 4 | 222 | 15 |
| Attributable | Full | 0.25 | 4.0 | 0.94 | 0.20 | 0.99 | 1 | 3 | 227 | 4 |
| Pragmatic | Full | 0.29 | 3.5 | 0.87 | 0.15 | 0.98 | 2 | 5 | 225 | 11 |
| All-cause | Intersection | 0.00 | Inf | 0.00 | 0.00 | 0.99 | 0 | 1 | 188 | 9 |
| Attributable | Intersection | 0.17 | 6.0 | 0.88 | 0.20 | 0.97 | 1 | 5 | 188 | 4 |
Predicting C. difficile Infection Severity from the Taxonomic Composition of the Gut Microbiome
Introduction
prevalance of cdi. prevalance of severe cdi outcomes.
Numerous studies indicate that the gut microbiome may play a role in C. diff colonization, infection, and clearance.
prediction models based on EHR for whether infection occurs in the first place already in use. so how about predicting severity of infections to guide treatment? OTUs vs EHRs to predict severity.
A few ways to define CDI severity (Figure 1).
The IDSA definition is known to be a poor predictor of adverse outcomes (Stevens et al. 2020), however, it is easy to collect.
new dataset.
Two goals: investigate whether we can predict CDI severity based on OTU data to inform how the gut microbiome may modulate severity (ML-based science: good performance implies something about underlying biology), and determine whether there is potential clinical value in OTU-based models.
Results
Model performance
We first set out to train the best models possible for each severity definition. Not all samples have labels available for all four severity definitions due to missing data for some patient lab values and incomplete chart review (TODO table).
However, comparisons across these definitions is not fair when using different subsets of the data for each definition. To better compare the model performances across different severity definitions, we selected the intersection of samples (993) that had labels for all four severity definitions and repeated the model training and evaluation process.
Report median AUROC for training set and test set, and median AUBPRC for test set (Figure 2). Nearly all pairs of definitions have significantly different performances on the test set (P < 0.05) except for AUROC and AUBPRC of Attributable vs. Pragmatic on the intersection dataset (as they are identical), AUROC of Attributable vs. All-cause on the full dataset, and AUROC of Attributable vs. IDSA on the full dataset.
Feature importance
We performed permutation feature importance to determine which OTUs contributed the most to model performance (Figure 3).
Estimating clinical benefit
table of NNS at 95th pct of risk for each outcome x dataset (Table 1).
NNT x NNS = NNB
NNT for Fidaxomicin, FMT, and/or Bezlotoxumab. Current standard is Vancomycin because it’s cheaper than Fidaxomicin, even though IDSA recommends Fidaxomicin.
pragmatic vs attributable severity. table?
rough estimate of costs. current: everyone gets vancomycin. potential: patients flagged as severe get fidaxomicin. based on NNB, estimate how much money saved in averting severe outcomes.
decision curve analysis.
Discussion
Compare to EHR-based models.
Discuss important OTUs. which ones concord with literature, which ones may be new.
Amplicon sequencing is not typically performed for CDI patients, but if there is clinical value to be gained by implementing OTU-based models, we could make a case to routinely sequence and profile the microbial communities of CDI patients.
Materials and Methods
Sample collection
This study was approved by the University of Michigan Institutional Review Board. All patient samples were collected by the University of Michigan Health System from January 2016 through December 2017. Stool samples that had unformed stool consistency were tested for C. difficile by the clinical microbiology lab with a two-step algorithm that included detection of C. difficile glutamate dehydrogenase and toxins A and B by enzyme immunoassay with reflex to PCR for the tcdB gene when results were discordant. 1,517 stool samples were collected from patients diagnosed with a CDI. Leftover stool samples that were sent to the clinical microbiology lab were collected and split into different aliquots. For 16S sequencing, the aliquot of stool was resuspended in DNA genotek stabilization buffer and then stored in the -80°C freezer. Only the first CDI sample per patient was used for subsequent ML analyses such that no patient is represented more than once, resulting in a dataset of 1,277 samples.
16S rRNA gene amplicon sequencing
Samples stored in DNA genotek buffer were thawed from the -80°C, vortexed, and then transferred to a 96-well bead beating plate for DNA extractions. DNA was extracted using the DNeasy Powersoil HTP 96 kit (Qiagen) and an EpMotion 5075 automated pipetting system (Eppendorf). The V4 region of the 16S rRNA gene was amplified with the AccuPrime Pfx DNA polymerase (Thermo Fisher Scientific) using custom barcoded primers, as previously described (Kozich et al. 2013). Each library preparation plate for sequencing contained a negative control (water) and mock community control (ZymoBIOMICS microbial community DNA standards). The PCR amplicons were normalized (SequalPrep normalizatin plate kit from Thermo Fisher Scientific), pooled and quantified (KAPA library quantification kit from KAPA Biosystems), and sequenced with the MiSeq system (Illumina).
All sequences were processed with mothur (v1.46) using the MiSeq SOP protocol (Schloss et al. 2009; Kozich et al. 2013). Paired sequencing reads were combined and aligned with the SILVA (v132) reference database (Quast et al. 2013) and taxonomy was assigned with a modified version of the Ribosomal Database Project reference sequences (v16) (Cole et al. 2014). Sequences were clustered into de novo OTUs with the OptiClust algorithm in mothur (Westcott and Schloss 2017), resulting in Samples were rarefied to 5,000 sequences per sample, repeated 1,000 times for alpha and beta diversity analysis.
Defining CDI severity
We explore four different ways to define CDI cases as severe or not. The IDSA definition of severe CDI is based on lab values collected on the day of diagnosis, with a case being severe if serum creatinine level is greater than or equal to \(1.5 mg/dL\) and the white blood cell count is greater than or equal to \(15 k/\mu L\) (mcdonald_clinical_2018?). The remaining definitions focus on the occurrence of adverse outcomes, which may be more clinically relevant. The all-cause severity definition defines a case as severe if ICU admission, colectomy, or death occurs within 30 days of CDI diagnosis, regardless of the cause of the adverse event. The attributable severity definition is based on disease-related complications defined by the CDC, where an adverse event of ICU admission, colectomy, or death occurs within 30 days of CDI diagnosis, and the adverse event is determined to be attributable to the CDI by physician chart review (McDonald et al. 2007). Finally, we defined a pragmatic severity definition that makes use of the attributable definition when available and falls back to the all-cause definition when chart review has not been completed, allowing us to use as many samples as we have available while taking physicians’ expert opinions into account where possible.
Model training and evaluation
Random forest models were used to examine whether OTU data collected on the day of diagnosis could classify CDI cases as severe according to four different definitions of severity. We used the mikropml R package v1.5.0 (Topçuoğlu et al. 2021) implemented in a custom version of the mikropml Snakemake workflow (Sovacool et al. 2023) for all steps of the machine learning analysis. We have a full dataset which uses all samples available for each severity definition, and an intersection dataset which consists of only the samples that have all four definitions labelled. The intersection dataset is the most fair for comparing model performance across definitions, while the full dataset allows us to use as much data as possible for model training and evaluation. Datasets were preprocessed with the default options in mikropml to remove features with near-zero variance and scale continuous features from -1 to 1. During preprocessing, 9,757 to 9,760 features were removed due to having near-zero variance, resulting in datasets ranging from 179 to 182 depending on the severity definition. We randomly split the data into an 80% training and 20% test set and repeated this 100 times, followed by training models with 5-fold cross-validation. Model performance was calculated on the test set using the area under the receiver-operator characteristic curve (AUROC) and the area under the balanced precision-recall curve (AUBPRC). Permutation importance was then performed to determine which OTUs contributed most to model performance. We reported OTUs with a significant permutation test in at least 75 of the 100 models.
Since the severity labels are imbalanced with different frequencies of severity for each definition, we calculated balanced precision, the precision expected if the labels were balanced. The balanced precision and the area under the balanced precision-recall curve (AUBPRC) were calculated with Equations 1 and 7 from Wu et al. (2021).
Code availability
The complete workflow, code, and supporting files required to reproduce this manuscript with accompanying figures is available at https://github.com/SchlossLab/severe-CDI.
The workflow was defined with Snakemake (Köster and Rahmann 2012) and dependencies were managed with conda environments. Scripts were written in R (R Core Team 2020), Python (Van Rossum and Drake 2009), and GNU bash. Additional software and packages used in the creation of this manuscript include cowplot (Wilke 2020a), ggtext (Wilke 2020b), ggsankey (Sjoberg 2022), schtools (Sovacool, Lesniak, and Schloss 2022), the tidyverse metapackage (Wickham et al. 2019), Quarto, and vegan (Oksanen et al. 2023).
Data availability
The 16S rRNA sequencing data have been deposited in the National Center for Biotechnology Information Sequence Read Archive (BioProject Accession no. PRJNA729511).
Acknowledgements
TODO